Near-Optimal Hybrid Processing for Massive
MIMO Systems via Matrix Decomposition

Weiheng Ni, Xiaodai Dong, and Wu-Sheng Lu 


Abstract: For the practical implementation of massive
multiple-input multiple-output (MIMO) systems, the hybrid processing
(precoding/combining) structure is a promising way to reduce the high
cost incurred by the large number of RF chains in the traditional
processing structure. Hybrid processing is performed through
low-dimensional digital baseband processing combined with analog RF
processing enabled by phase shifters. We propose to design hybrid RF
and baseband precoders/combiners for multi-stream transmission in
point-to-point massive MIMO systems by directly decomposing the
pre-designed unconstrained digital precoder/combiner of a large
dimension. The constant amplitude constraint of analog RF processing
makes the matrix decomposition problem non-convex. Based on an
alternate optimization technique, the non-convex matrix decomposition
problem can be decoupled into a series of convex sub-problems and
effectively solved by restricting the phase increment of each entry in
the RF precoder/combiner to a small vicinity of its preceding iterate.
A singular value decomposition based technique is proposed to secure
an initial point sufficiently close to the global solution of the
original non-convex problem. Through simulation, the convergence of
the alternate optimization for such a matrix decomposition based
hybrid processing (MD-HP) scheme is examined, and the performance of
the MD-HP scheme is demonstrated to be near-optimal.

Index Terms: Massive MIMO, hybrid processing, limited RF
chains, matrix decomposition, alternate optimization.

I. Introduction 

Massive multiple-input multiple-output (MIMO) is potentially one of
the key technologies to achieve high capacity performance in the next
generation of mobile cellular systems [?]. In the limit of an infinite
number of antennas, the massive MIMO propagation channel becomes
quasi-static, where the effects of uncorrelated noise and fast fading
vanish, and such favorable characteristics enable arbitrarily small
energy per transmitted bit [2]. Prominently, in massive multiuser MIMO
systems simple linear processing schemes, such as zero-forcing (ZF)
and linear minimum mean-square error (MMSE), are shown to approach the
optimal capacity performance achieved by dirty paper coding in the
downlink communication [?]. The spectral efficiency performance of
massive MIMO systems with several linear processing schemes, including
ZF, MMSE and maximum-ratio combining (MRC), with perfect or imperfect
channel state information (CSI) has been analyzed in [?].

For practical implementation of massive MIMO systems, the number of
antennas required for large antenna array gains, typically on the
order of a hundred or more, is determined by examining the convergence
properties over the antenna number [?]. However, to exploit such a
large antenna array in massive MIMO systems, the amplitudes and phases
of the complex transmit symbols are traditionally modified at the
baseband, and then upconverted to the passband around the carrier
frequency after passing through radio frequency (RF) chains
(performing the analog radiowave/digital baseband conversion, signal
mixing, and power amplification). In this setting, all outputs of the
RF chains are connected to the antenna elements, which means that the
number of RF chains must be exactly equal to the number of antenna
elements. Under these circumstances, the fabrication cost and energy
consumption of such a massive MIMO system become unbearable due to the
tremendous number of RF chains [?].

To deal with the aforementioned problem, a smaller number of RF chains
is used in large-scale MIMO systems, where cost-effective variable
phase shifters can be employed to handle the mismatch between the
number of RF chains and the number of antennas [?]. In this structure,
high-dimensional analog RF (phase-only) processing is enabled by phase
shifters, while digital baseband processing is performed in a very low
dimension. In [9], both diversity and multiplexing transmissions of
MIMO communications are addressed with a limited number of RF chains.
In [?] and [?], analog RF precoding is presented to achieve full
diversity order and near-optimal beamforming performance. Reference
[?] applies phase-only RF precoding to massive MIMO systems to
maximize the data rate of users based on a bi-convex approximation
approach. In particular, the small wavelengths of millimeter wave
(mmWave) signals make it possible to build a large antenna array in a
compact region. The hybrid baseband and RF processing (transmit
precoding/receive combining) scheme is found particularly suitable for
mmWave MIMO communications as it effectively reduces the excessive
cost of RF chains [?]. There, hybrid processing is designed to capture
"dominant" paths in point-to-point (P2P) mmWave channels by choosing
RF control phases from array response vectors [?], [14]. On the other
hand, hybrid processing in multiuser mmWave systems is investigated in
[?], where analog RF processing aims to obtain large antenna gains,
while baseband processing is performed in low-dimensional equivalent
channels.

More often than not, CSI is the prerequisite to perform any processing
at the transmitter and receiver, whether it is unconstrained
high-dimensional baseband processing for the traditional design, with
each antenna element coupled to one dedicated RF chain, or hybrid
processing. In [?], training sequences and closed-loop sounding
vectors are designed to estimate a massive multiple-input
single-output (MISO) channel through the alignment of the transmit
beamformer with the true channel direction. In [?], a compressive
sensing (CS) based low-rank approximation problem for estimating the
massive MIMO channel matrix is formulated and solved by semidefinite
programming. Considering massive MIMO channels with limited scattering
(especially when mmWave channels are involved), the parameters of
paths, such as the angles of departure (AoDs), angles of arrival
(AoAs) and the corresponding path loss, are estimated by designing a
beamforming codebook so as to obtain the path loss of all paths whose
AoDs/AoAs are spatially quantized over the entire angular domain [?],
while the beamforming is performed in a hybrid processing setting.

In this paper, we propose to design hybrid RF and baseband
precoders/combiners for multi-stream transmission in P2P massive MIMO
systems by directly decomposing the pre-designed unconstrained digital
precoder/combiner of a large dimension. This is an approach that has
not been attempted in the literature. In our design, the analog RF
precoder and combiner are constrained by the nature of the phase
shifters so that the amplitudes of all entries of the RF
precoder/combiner matrices remain constant. Starting with an optimal
unconstrained precoder built on a set of right singular vectors
(associated with the largest singular values) of the channel matrix,
our hybrid precoders are designed by minimizing the Frobenius norm of
the difference between the unconstrained precoding matrix and the
product of the hybrid RF and baseband precoding matrices, subject to
the aforementioned constraints on the RF precoder. Technically,
solving this matrix decomposition problem is rather challenging
because it is a highly non-convex constrained problem involving a
fairly large number of design parameters. Here we present an alternate
optimization technique in which the hybrid precoders are alternately
optimized in a relaxed setting so that all sub-problems involved are
convex. We stress that the convex relaxation technique utilized here
includes not only properly grouping design parameters for alternate
optimization, but also restricting the phase increment of each entry
in the RF precoder to within a small vicinity of its preceding
iterate. Under these circumstances, it is critical to start the
proposed decomposition algorithm with a suitable initial point that is
sufficiently close to the global solution of the original non-convex
matrix decomposition problem. To this end, a
singular-value-decomposition (SVD) based technique is proposed to
secure a satisfactory initial point that with high probability allows
our decomposition algorithm to yield near-optimal hybrid precoders.
Concerning the hybrid combiner design, a linear MMSE combiner is
selected as the unconstrained reference matrix for matrix
decomposition, and the hybrid RF and baseband combiners can be
obtained in the same way as in the hybrid precoder design.

We remark that the matrix decomposition based hybrid processing design
scheme, termed MD-HP, is suited for hybrid processing design over any
general massive MIMO channel as long as the channel matrix is assumed
to be known. Simulations are presented to examine the convergence of
the alternate optimization for the MD-HP scheme and to demonstrate the
near-optimal performance of the MD-HP scheme by comparing it to the
optimal unconstrained baseband processing based on the SVD technique.

II. System Model 

In this section, we introduce the hybrid processing structure 
for P2P massive MIMO systems and the channel models 
considered in this paper. 

A. System Model 

We consider a communication scenario from a transmitter with $N_t$
antennas and $M_t$ RF chains to a receiver equipped with $N_r$
antennas and $M_r$ RF chains, where $N_s$ data streams are supported.
The system model of the transceiver is shown in Fig. 1. To ensure
effectiveness of the communication driven by the limited number of RF
chains, the number of communication streams is bounded by
$N_s \le M_t \le N_t$ for the transmitter and by
$N_s \le M_r \le N_r$ for the receiver.



Fig. 1. System model of the transceiver with the hybrid processing structure 

The transmitted symbols are processed by a baseband precoder
$\mathbf{F}_B$ of dimension $M_t \times N_s$, then up-converted to the
RF domain through the $M_t$ RF chains before being precoded by an RF
precoder $\mathbf{F}_R$ of dimension $N_t \times M_t$. Note that the
baseband precoder $\mathbf{F}_B$ enables both amplitude and phase
modifications, while only phase changes can be realized by
$\mathbf{F}_R$ as it is implemented by using analog phase shifters. We
normalize each entry of $\mathbf{F}_R$ to satisfy
$|\mathbf{F}_R^{(i,j)}| = \frac{1}{\sqrt{N_t}}$, where
$|(\cdot)^{(i,j)}|$ denotes the amplitude of the $(i,j)$-th element of
$(\cdot)$. Furthermore, to meet the constraint on total transmit
power, $\mathbf{F}_B$ is normalized to satisfy
$\|\mathbf{F}_R\mathbf{F}_B\|_F^2 = N_s$, where $\|\cdot\|_F$ denotes
the Frobenius norm [?].

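We assume a narrowband flat fading channel model, and the received
signal is given by

$$\mathbf{y} = \mathbf{H}\mathbf{F}_R\mathbf{F}_B\mathbf{s} + \mathbf{n}, \tag{1}$$

where $\mathbf{y} \in \mathbb{C}^{N_r \times 1}$ is the received
signal vector, $\mathbf{s} \in \mathbb{C}^{N_s \times 1}$ is the
signal vector such that
$E[\mathbf{s}\mathbf{s}^H] = \frac{P}{N_s}\mathbf{I}_{N_s}$, where
$(\cdot)^H$ denotes conjugate transpose, $E[\cdot]$ denotes
expectation, $\mathbf{I}_{N_s}$ is the $N_s \times N_s$ identity
matrix and $P$ is the average transmit power.
$\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ is the channel matrix,
normalized as $E[\|\mathbf{H}\|_F^2] = N_t N_r$, and $\mathbf{n}$ is
the vector of i.i.d. $\mathcal{CN}(0, \sigma^2)$ additive complex
Gaussian noise. To perform the precoding and combining, we assume the
channel is known at both the transmitter and the receiver, thus the
processed received signal after combining is given by

$$\tilde{\mathbf{y}} = \mathbf{W}_B^H\mathbf{W}_R^H\mathbf{H}\mathbf{F}_R\mathbf{F}_B\mathbf{s} + \mathbf{W}_B^H\mathbf{W}_R^H\mathbf{n}, \tag{2}$$

where $\mathbf{W}_R$ is the $N_r \times M_r$ RF combining matrix and
$\mathbf{W}_B$ is the $M_r \times N_s$ baseband combining matrix.
Since $\mathbf{W}_R$ is also implemented by analog phase shifters, all
elements of $\mathbf{W}_R$ are constrained to have constant amplitude
such that $|\mathbf{W}_R^{(i,j)}| = \frac{1}{\sqrt{N_r}}$. If Gaussian
inputs are employed at the transmitter, the long-term average spectral
efficiency achieved is

$$R(\mathbf{F}_R, \mathbf{F}_B, \mathbf{W}_R, \mathbf{W}_B) = \log_2\left|\mathbf{I}_{N_s} + \frac{P}{N_s}\mathbf{R}_n^{-1}\tilde{\mathbf{H}}\tilde{\mathbf{H}}^H\right|, \tag{3}$$

where $\mathbf{R}_n = \sigma^2\mathbf{W}_B^H\mathbf{W}_R^H\mathbf{W}_R\mathbf{W}_B$
is the covariance matrix of the noise and
$\tilde{\mathbf{H}} = \mathbf{W}_B^H\mathbf{W}_R^H\mathbf{H}\mathbf{F}_R\mathbf{F}_B$.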


B. Channel Model

In this paper, we seek to find optimal hybrid precoders
($\mathbf{F}_R$, $\mathbf{F}_B$) as well as hybrid combiners
($\mathbf{W}_R$, $\mathbf{W}_B$) based on a general channel matrix
$\mathbf{H}$. To measure the performance of our MD-HP scheme, we
examine two types of channel models in the simulation studies
presented in Section IV, namely

1) Large Rayleigh fading channel $\mathbf{H}_{rl}$ with i.i.d.
$\mathcal{CN}(0, 1)$ entries;

2) Limited scattering mmWave channel $\mathbf{H}_{mmw}$.

We remark that several hybrid processing schemes for mmWave
communications have been studied, where a large antenna array is
implemented to combat high free-space path loss and reflection loss
[?]. Thus the mmWave channel model $\mathbf{H}_{mmw}$ is an
appropriate instance for comparing the performance of the proposed
scheme with related recent work in the literature. Because of the
limited (sparse) scattering characteristic of a mmWave channel, we
introduce a clustered mmWave channel model to characterize its key
features [22]. The mmWave channel $\mathbf{H}_{mmw}$ is assumed to be
the sum of all propagation paths that are scattered in $N_c$ clusters,
with each cluster contributing $N_p$ paths. Under these circumstances,
the normalized channel matrix^1 can be expressed as

$$\mathbf{H}_{mmw} = \sqrt{\frac{N_t N_r}{N_c N_p}} \sum_{i=1}^{N_c}\sum_{l=1}^{N_p} \alpha_{il}\,\mathbf{a}_r(\theta_{il})\,\mathbf{a}_t(\phi_{il})^H, \tag{4}$$

where $\alpha_{il}$ is the complex gain of the $l$-th path in the
$i$-th cluster, which follows $\mathcal{CN}(0, 1)$. For the
$(i,l)$-th path, $\theta_{il}$ and $\phi_{il}$ are the azimuth angles
of arrival/departure (AoA/AoD), while $\mathbf{a}_r(\theta_{il})$ and
$\mathbf{a}_t(\phi_{il})$ are the receive and transmit array response
vectors at the azimuth angles $\theta_{il}$ and $\phi_{il}$,
respectively, and the elevation dimension is ignored.^2 Within the
$i$-th cluster, $\theta_{il}$ and $\phi_{il}$ have the uniformly
distributed mean values $\theta_i$ and $\phi_i$, respectively, while
the lower and upper bounds of the uniform distribution for $\theta_i$
and $\phi_i$ are defined as $[\theta_{\min}, \theta_{\max}]$ and
$[\phi_{\min}, \phi_{\max}]$. The angular spreads (standard
deviations) of $\theta_{il}$ and $\phi_{il}$ among all clusters are
assumed to be constant, denoted as $\sigma_\theta$ and $\sigma_\phi$.
According to [14], we use the truncated Laplacian distribution to
generate all the $\theta_{il}$ and $\phi_{il}$ based on the above
parameters.

As for the array response vectors $\mathbf{a}_r(\theta_{il})$ and
$\mathbf{a}_t(\phi_{il})$, we choose uniform linear arrays (ULAs) in
our simulations, while the precoding scheme to be developed in
Section III can directly be applied to arbitrary antenna arrays. For
an $N$-element ULA, the array response vector can be given by

$$\mathbf{a}_{ULA}(\theta) = \frac{1}{\sqrt{N}}\left[1,\; e^{j\frac{2\pi}{\lambda}d\sin(\theta)},\; \ldots,\; e^{j(N-1)\frac{2\pi}{\lambda}d\sin(\theta)}\right]^T, \tag{5}$$

where $\lambda$ is the wavelength of the carrier, and $d$ is the
distance between any two adjacent antenna elements. The array response
vectors at both the transmitter and the receiver can be written in the
form of (5).

^1 The power gain of the channel matrix is normalized such that
$E[\|\mathbf{H}_{mmw}\|_F^2] = N_t N_r$.

^2 Only 2D beamforming is considered in this mmWave channel model.
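As a small illustration of the ULA array response vector used in the channel model, the sketch below evaluates it in plain Python. The function name `ula_response`, the unit-norm scaling by $1/\sqrt{N}$, and the default half-wavelength spacing ($d/\lambda = 0.5$) are assumptions made for this example, not values stated in the text.

```python
import cmath
import math


def ula_response(n_elements, theta, d_over_lambda=0.5):
    """Array response of an N-element ULA at azimuth angle theta (radians).

    d_over_lambda is the element spacing in wavelengths; 0.5 is an
    assumed default, not a value taken from the paper.
    """
    # Phase step between adjacent elements: (2*pi/lambda) * d * sin(theta).
    phase_step = 2.0 * math.pi * d_over_lambda * math.sin(theta)
    scale = 1.0 / math.sqrt(n_elements)  # unit-norm normalisation
    return [scale * cmath.exp(1j * k * phase_step) for k in range(n_elements)]


# Example: 8 elements, 30 degrees off broadside.
a = ula_response(8, math.radians(30.0))
norm_sq = sum(abs(x) ** 2 for x in a)
```

Every entry has the same magnitude, so the vector itself satisfies the kind of constant amplitude constraint that phase shifters impose.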


III. Hybrid Precoding/Combining Design for A 
General Massive MIMO Channel 

The design of hybrid precoders ($\mathbf{F}_R$, $\mathbf{F}_B$) and
combiners ($\mathbf{W}_R$, $\mathbf{W}_B$) based on a general massive
MIMO channel $\mathbf{H}$ may be achieved by formulating a joint
transmitter-receiver optimization problem to maximize the spectral
efficiency, which is given by

$$\begin{aligned} \max_{\mathbf{F}_R, \mathbf{F}_B, \mathbf{W}_R, \mathbf{W}_B} \quad & R(\mathbf{F}_R, \mathbf{F}_B, \mathbf{W}_R, \mathbf{W}_B) \\ \text{s.t.} \quad & \|\mathbf{F}_R\mathbf{F}_B\|_F^2 = N_s, \\ & \mathbf{F}_R \in \mathcal{F}_R,\; \mathbf{W}_R \in \mathcal{W}_R, \end{aligned} \tag{6}$$

where $\mathcal{F}_R$ ($\mathcal{W}_R$) is the set of matrices whose
entries all have constant amplitude $\frac{1}{\sqrt{N_t}}$
($\frac{1}{\sqrt{N_r}}$). However, this type of joint optimization
problem is often intractable [?], due to the presence of the
non-convex constraints $\mathbf{F}_R \in \mathcal{F}_R$ and
$\mathbf{W}_R \in \mathcal{W}_R$ that obstruct the regular progress of
securing a globally optimal solution. Before gaining an insight into
the solution of the joint optimization problem (6), we introduce the
optimal unconstrained precoder $\mathbf{F}^*$ and combiner
$\mathbf{W}^*$ for achieving maximum capacity of a general MIMO
channel, based on which a procedure for the design of near-optimal
hybrid precoders/combiners is developed. Assume that the channel
matrix $\mathbf{H}$ is well-conditioned to transmit $N_s$ data
streams, namely, $\mathrm{rank}(\mathbf{H}) \ge N_s$. To obtain the
optimal $\mathbf{F}^*$ and $\mathbf{W}^*$, we perform the SVD of the
channel matrix $\mathbf{H} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H$,
where $\mathbf{U}$ and $\mathbf{V}$ are $N_r \times N_r$ and
$N_t \times N_t$ unitary matrices, respectively, and
$\boldsymbol{\Sigma}$ is an $N_r \times N_t$ diagonal matrix with
singular values on its diagonal in descending order. Without
incorporating the waterfilling power allocation, the optimal
unconstrained precoder and combiner are given by

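$$\mathbf{F}^* = \mathbf{V}_1, \quad \mathbf{W}^* = \mathbf{U}_1, \tag{7}$$

where $\mathbf{V}_1$ and $\mathbf{U}_1$ are constructed with the first
$N_s$ columns of $\mathbf{V}$ and $\mathbf{U}$, respectively, and the
corresponding spectral efficiency by using such unconstrained
$\mathbf{F}^*$ and $\mathbf{W}^*$ is given by

$$R = \log_2\left|\mathbf{I}_{N_s} + \frac{\gamma}{N_s}\boldsymbol{\Sigma}_1^2\right|, \tag{8}$$

where $\boldsymbol{\Sigma}_1$ represents the first partition of
dimension $N_s \times N_s$ of $\boldsymbol{\Sigma}$ by defining that

$$\boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\Sigma}_1 & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma}_2 \end{bmatrix}, \tag{9}$$

and $\gamma = \frac{P}{\sigma^2}$ is the signal-to-noise ratio (SNR).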
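The SVD-based unconstrained design can be checked numerically. The following NumPy sketch, written under assumed illustrative sizes and an assumed linear SNR value (none of them taken from the paper), builds $\mathbf{F}^* = \mathbf{V}_1$ and $\mathbf{W}^* = \mathbf{U}_1$ and compares the rate computed from the $N_s$ largest singular values with the rate computed from the effective channel $\mathbf{W}^{*H}\mathbf{H}\mathbf{F}^*$.

```python
import numpy as np

rng = np.random.default_rng(0)
Nr, Nt, Ns = 16, 64, 4   # illustrative sizes, not the paper's setup
snr = 10.0               # gamma = P / sigma^2 on a linear scale (assumed)

# i.i.d. CN(0, 1) channel matrix
H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)

U, s, Vh = np.linalg.svd(H)      # singular values come sorted in descending order
F_opt = Vh.conj().T[:, :Ns]      # V_1: first Ns right singular vectors
W_opt = U[:, :Ns]                # U_1: first Ns left singular vectors

# Rate from the Ns largest singular values
R_bound = np.sum(np.log2(1.0 + snr / Ns * s[:Ns] ** 2))

# Same quantity from the effective channel; the noise covariance is
# sigma^2 * I here because U_1 has orthonormal columns.
H_eff = W_opt.conj().T @ H @ F_opt
_, logdet = np.linalg.slogdet(np.eye(Ns) + snr / Ns * H_eff @ H_eff.conj().T)
R_check = logdet / np.log(2.0)
```

The two values agree up to rounding because the effective channel reduces to the diagonal block $\boldsymbol{\Sigma}_1$, and $\|\mathbf{F}^*\|_F^2 = N_s$ holds automatically since $\mathbf{V}_1$ has orthonormal columns.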

Actually, $R$ sets an upper bound for the spectral efficiency
$R(\mathbf{F}_R, \mathbf{F}_B, \mathbf{W}_R, \mathbf{W}_B)$ in problem
(6), where the ranges of the matrix products
$\mathbf{F}_R\mathbf{F}_B$ and $\mathbf{W}_R\mathbf{W}_B$ are
respectively subsets of the feasible regions of the unconstrained
precoder and combiner, namely $\mathbb{C}^{N_t \times N_s}$ and
$\mathbb{C}^{N_r \times N_s}$. Considering the non-convex nature of
problem (6), it is impractical to insist upon securing its global
solution. One apparently viable approach is to construct hybrid
precoders ($\mathbf{F}_R$, $\mathbf{F}_B$) and combiners
($\mathbf{W}_R$, $\mathbf{W}_B$) such that the optimal unconstrained
precoder $\mathbf{F}^*$ and combiner $\mathbf{W}^*$ can be
sufficiently closely approached by $\mathbf{F}_R\mathbf{F}_B$ and
$\mathbf{W}_R\mathbf{W}_B$, respectively. In what follows, the design
of such hybrid precoders is carried out via matrix decomposition.


A. Hybrid Precoders Design via Matrix Decomposition 

Given the hybrid precoding structure and the constraint on the RF
precoder $\mathbf{F}_R$, there is no guarantee that a pair
($\mathbf{F}_R$, $\mathbf{F}_B$) can be found such that
$\mathbf{F}^* = \mathbf{F}_R\mathbf{F}_B$ holds exactly. However, by
relaxing this strict equality, the matrix decomposition can be
accomplished through reformulating the original problem (6) as

$$\begin{aligned} \min_{\mathbf{F}_R, \mathbf{F}_B} \quad & \|\mathbf{F}^* - \mathbf{F}_R\mathbf{F}_B\|_F \\ \text{s.t.} \quad & \|\mathbf{F}_R\mathbf{F}_B\|_F^2 = N_s, \\ & \mathbf{F}_R \in \mathcal{F}_R. \end{aligned} \tag{10}$$

To look closely at the physical implication of this problem
re-formulation, recall that our design objective is essentially to
approximate $\mathbf{F}^*$ by the product of hybrid precoding
matrices, namely $\mathbf{F}_R\mathbf{F}_B$. Thus a natural question
arising at this point is how sensitive the spectral efficiency
$R(\mathbf{F}_R, \mathbf{F}_B, \mathbf{W}_R, \mathbf{W}_B)$ is to any
deviation of $\mathbf{F}_R\mathbf{F}_B$ from $\mathbf{F}^*$, because a
small residue $\|\mathbf{F}^* - \mathbf{F}_R\mathbf{F}_B\|_F$ at a
solution of problem (10) is inevitable, and this residue would divert
the optimal unconstrained combiner $\mathbf{W}^*$ away from the
SVD-based solution $\mathbf{U}_1$.

Bearing the analysis above in mind, we begin the design of hybrid
precoders by assuming that $N_r$-dimensional minimum distance decoding
can be performed at the receiver, which implies that the achieved
spectral efficiency is equivalent to the mutual information over the
MIMO channel when Gaussian inputs are used, which is given by

$$I(\mathbf{F}_R, \mathbf{F}_B) = \log_2\left|\mathbf{I}_{N_r} + \frac{\gamma}{N_s}\mathbf{H}\mathbf{F}_R\mathbf{F}_B\mathbf{F}_B^H\mathbf{F}_R^H\mathbf{H}^H\right|. \tag{11}$$

Next, we obtain the hybrid precoders by maximizing the mutual
information in (11). The mutual information maximization problem has
been investigated in [?], where the mutual information is approximated
as

$$I(\mathbf{F}_R, \mathbf{F}_B) \approx \log_2\left|\mathbf{I}_{N_s} + \frac{\gamma}{N_s}\boldsymbol{\Sigma}_1^2\right| - \left(N_s - \|\mathbf{V}_1^H\mathbf{F}_R\mathbf{F}_B\|_F^2\right), \tag{12}$$

and $\max I(\mathbf{F}_R, \mathbf{F}_B) \approx \max \|\mathbf{V}_1^H\mathbf{F}_R\mathbf{F}_B\|_F^2$
is approximately equivalent to minimizing
$\|\mathbf{F}^* - \mathbf{F}_R\mathbf{F}_B\|_F$. Consequently,
designing ($\mathbf{F}_R$, $\mathbf{F}_B$) so as to maximize the
mutual information over the massive MIMO channel can be accomplished
by solving the matrix decomposition problem (10). Once the hybrid
precoders ($\mathbf{F}_R$, $\mathbf{F}_B$) are optimized, we can
proceed to design the hybrid combiners ($\mathbf{W}_R$,
$\mathbf{W}_B$) to maximally increase the system's spectral
efficiency.


The second constraint in (10), requiring that the entries of
$\mathbf{F}_R$ have constant amplitude, is evidently non-convex, which
precludes the use of efficient convex optimization algorithms and
makes it extremely challenging to secure a globally optimal solution.
Under these circumstances, our design searches for a near-optimal
solution so that the spectral efficiency achieved by the obtained
hybrid precoders (as well as the hybrid combiners) is comparable with
the upper bound $R$. The design method described below has three main
ingredients: an alternate optimization strategy that separates the two
sets of design parameters in a natural manner; a local convexification
technique that ensures each sub-problem is solved in a convex setting;
and a carefully chosen initial point that helps the alternate iterates
converge to a satisfactory design.

Alternate minimization is an iterative procedure with each iteration
carried out in two steps. In each of these steps, one set of design
parameters is fixed while the objective function is minimized with
respect to the other set of parameters, and the roles of the two sets
alternate as the design step switches. For the design problem at hand,
the components of $\mathbf{F}_R$ and those of $\mathbf{F}_B$ are
naturally the two parameter sets, and the alternate minimization is
performed as follows: 1) solve problem (10) with respect to
$\mathbf{F}_B$ with $\mathbf{F}_R$ given; and 2) solve problem (10)
with respect to $\mathbf{F}_R$ with $\mathbf{F}_B$ given.

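We begin by examining a simplified version of problem (10) by
temporarily removing the normalization constraint
$\|\mathbf{F}_R\mathbf{F}_B\|_F^2 = N_s$, which leads (10) to

$$\begin{aligned} \min_{\mathbf{F}_R, \mathbf{F}_B} \quad & \|\mathbf{F}^* - \mathbf{F}_R\mathbf{F}_B\|_F \\ \text{s.t.} \quad & \mathbf{F}_R \in \mathcal{F}_R. \end{aligned} \tag{13}$$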


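Denote the hybrid precoders at the $k$-th iteration by
($\mathbf{F}_R^{(k)}$, $\mathbf{F}_B^{(k)}$) and assume the initial
$\mathbf{F}_R^{(0)}$ is given. We update $\mathbf{F}_B^{(k)}$ by
solving the unconstrained convex problem
$\min_{\mathbf{F}_B^{(k)}} \|\mathbf{F}^* - \mathbf{F}_R^{(k)}\mathbf{F}_B^{(k)}\|_F$,
whose closed-form solution is given by

$$\mathbf{F}_B^{(k)} = \left(\mathbf{F}_R^{(k)H}\mathbf{F}_R^{(k)}\right)^{-1}\mathbf{F}_R^{(k)H}\mathbf{F}^*, \quad k = 0, 1, 2, \ldots \tag{14}$$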
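The baseband update is an ordinary least-squares fit, so it can be sketched in a few lines of NumPy. In the example below, the helper name `update_baseband`, the matrix sizes, and the randomly generated stand-ins for $\mathbf{F}^*$ and $\mathbf{F}_R$ are illustrative assumptions; the sketch uses `np.linalg.lstsq` in place of the explicit inverse and then compares against the closed form.

```python
import numpy as np


def update_baseband(F_opt, F_R):
    """Least-squares F_B minimising ||F_opt - F_R @ F_B||_F for a fixed F_R."""
    F_B, *_ = np.linalg.lstsq(F_R, F_opt, rcond=None)
    return F_B


rng = np.random.default_rng(1)
Nt, Mt, Ns = 32, 6, 4    # illustrative sizes only

# Random constant-amplitude RF precoder and a random semi-unitary target.
F_R = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, (Nt, Mt))) / np.sqrt(Nt)
F_opt = np.linalg.qr(rng.standard_normal((Nt, Ns))
                     + 1j * rng.standard_normal((Nt, Ns)))[0]

F_B = update_baseband(F_opt, F_R)

# Closed form written out literally, for comparison only.
F_B_closed = np.linalg.inv(F_R.conj().T @ F_R) @ F_R.conj().T @ F_opt
residual = F_opt - F_R @ F_B
```

At the least-squares solution the residual is orthogonal to the columns of the RF precoder, which is a convenient sanity check when implementing the alternating iterations.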


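In turn, we update the RF precoder to $\mathbf{F}_R^{(k+1)}$ by
solving the non-convex problem below while $\mathbf{F}_B^{(k)}$ is
given as a constant matrix:

$$\begin{aligned} \min_{\mathbf{F}_R^{(k+1)}} \quad & \|\mathbf{F}^* - \mathbf{F}_R^{(k+1)}\mathbf{F}_B^{(k)}\|_F \\ \text{s.t.} \quad & \mathbf{F}_R^{(k+1)} \in \mathcal{F}_R. \end{aligned} \tag{15}$$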

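To deal with the non-convex constraint
$\mathbf{F}_R^{(k+1)} \in \mathcal{F}_R$ in (15), we update
$\mathbf{F}_R^{(k+1)}$ with a local search in a small vicinity of
$\mathbf{F}_R^{(k)}$. Denoting the phase of the $(m,n)$-th entry of
$\mathbf{F}_R^{(k)}$ as $\phi_{m,n}^{(k)}$, $\mathbf{F}_R^{(k)}$ can
be represented as
$\mathbf{F}_R^{(k)} = \frac{1}{\sqrt{N_t}}\{e^{j\phi_{m,n}^{(k)}}\}$,
$m = 1, \cdots, N_t$, $n = 1, \cdots, M_t$. To characterize the
relation between $\mathbf{F}_R^{(k)}$ and $\mathbf{F}_R^{(k+1)}$, we
write $\mathbf{F}_R^{(k+1)}$ as

$$\mathbf{F}_R^{(k+1)} = \frac{1}{\sqrt{N_t}}\left\{e^{j\phi_{m,n}^{(k+1)}}\right\} = \frac{1}{\sqrt{N_t}}\left\{e^{j\left(\phi_{m,n}^{(k)} + \delta_{m,n}^{(k)}\right)}\right\}, \tag{16}$$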

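where $\delta_{m,n}^{(k)}$ is the phase increment of the $(m,n)$-th
entry of $\mathbf{F}_R^{(k)}$. Note that the approximation
$e^{j\delta_{m,n}^{(k)}} \approx 1 + j\delta_{m,n}^{(k)}$ holds as
long as $|\delta_{m,n}^{(k)}|$ is sufficiently small, e.g.
$|\delta_{m,n}^{(k)}| < 0.1$. Based on Taylor's expansion, therefore,
we have

$$\mathbf{F}_R^{(k+1)} \approx \frac{1}{\sqrt{N_t}}\left\{e^{j\phi_{m,n}^{(k)}}\left(1 + j\delta_{m,n}^{(k)}\right)\right\} = \mathbf{F}_R^{(k)} + \frac{j}{\sqrt{N_t}}\left\{\delta_{m,n}^{(k)}\right\} \circ \left\{e^{j\phi_{m,n}^{(k)}}\right\}, \tag{17}$$

where $\{\delta_{m,n}^{(k)}\}$ is the matrix whose $(m,n)$-th entry is
$\delta_{m,n}^{(k)}$, and "$\circ$" denotes the Hadamard product
(entrywise product). It follows that the problem in (15) for seeking
$\mathbf{F}_R^{(k+1)}$ can be reformulated as an optimization problem
with respect to $\{\delta_{m,n}^{(k)}\}$ as

$$\begin{aligned} & \min_{\{\delta_{m,n}^{(k)}\}} \left\|\mathbf{F}^* - \left(\mathbf{F}_R^{(k)} + \frac{j}{\sqrt{N_t}}\left\{\delta_{m,n}^{(k)}\right\} \circ \left\{e^{j\phi_{m,n}^{(k)}}\right\}\right)\mathbf{F}_B^{(k)}\right\|_F^2 \\ ={} & \min_{\{\delta_{m,n}^{(k)}\}} \left\|\mathbf{Q}^{(k)} - \frac{j}{\sqrt{N_t}}\left(\left\{\delta_{m,n}^{(k)}\right\} \circ \left\{e^{j\phi_{m,n}^{(k)}}\right\}\right)\mathbf{F}_B^{(k)}\right\|_F^2, \end{aligned} \tag{18}$$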

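where $\mathbf{Q}^{(k)} = \mathbf{F}^* - \mathbf{F}_R^{(k)}\mathbf{F}_B^{(k)}$.
We remark that problem (18) has a convex quadratic objective function,
and the constant amplitude constraint
$\mathbf{F}_R^{(k+1)} \in \mathcal{F}_R$ has been taken into account
because $\mathbf{F}_R^{(k+1)}$ here assumes the form of (16). However,
the above formulation is based on the approximation
$e^{j\delta_{m,n}^{(k)}} \approx 1 + j\delta_{m,n}^{(k)}$, hence it is
valid only if $|\delta_{m,n}^{(k)}|$ is sufficiently small. Therefore,
linear constraints on the smallness of $|\delta_{m,n}^{(k)}|$ need to
be imposed, thus problem (18) is modified to

$$\begin{aligned} \min_{\{\delta_{m,n}^{(k)}\}} \quad & \left\|\mathbf{Q}^{(k)} - \frac{j}{\sqrt{N_t}}\left(\left\{\delta_{m,n}^{(k)}\right\} \circ \left\{e^{j\phi_{m,n}^{(k)}}\right\}\right)\mathbf{F}_B^{(k)}\right\|_F^2 \\ \text{s.t.} \quad & |\delta_{m,n}^{(k)}| \le \bar{\delta}^{(k)}, \quad \forall m, n, \end{aligned} \tag{19}$$

where $\bar{\delta}^{(k)} > 0$ is sufficiently small such that
$e^{j\delta_{m,n}^{(k)}} \approx 1 + j\delta_{m,n}^{(k)}$ holds.
Problem (19) is a convex quadratic programming (QP) problem whose
unique global solution can be calculated efficiently [25]. Once the
solution $\{\delta_{m,n}^{(k)}\}$ is obtained, $\mathbf{F}_R^{(k+1)}$
can be updated by (16).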
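To get a feel for how tight the first-order approximation of the phase rotation is over the example range of phase increments up to 0.1 in magnitude, the short sketch below evaluates the linearisation error on a grid. The function name `linearisation_error` and the grid resolution are illustrative choices; the comparison against $\delta^2/2$ uses the standard Taylor remainder bound for $e^{j\delta}$.

```python
import cmath


def linearisation_error(delta):
    """|exp(j*delta) - (1 + j*delta)| for a real phase increment delta."""
    return abs(cmath.exp(1j * delta) - (1.0 + 1j * delta))


# Scan the interval [-0.1, 0.1] mentioned as an example in the text.
grid = [-0.1 + 0.001 * i for i in range(201)]
worst = max(linearisation_error(d) for d in grid)
```

The error stays below $\delta^2/2$ everywhere on the grid, so with increments bounded by 0.1 each entry of the RF precoder is perturbed by at most about $0.005/\sqrt{N_t}$ relative to the exact rotation.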


There are several issues that remain to be addressed. These include
defining an error measure to be used in the algorithm's stopping
criterion and elsewhere; selection of a good initial point to start
the algorithm; adaptive thresholding for the phase increments
$\delta_{m,n}^{(k)}$ and derivation of an explicit formulation for
problem (19); and a treatment of the constraint
$\|\mathbf{F}_R\mathbf{F}_B\|_F^2 = N_s$ in problem (10).

1) An Error Measure: The relative distance between $\mathbf{F}^*$ and
$\mathbf{F}_R^{(k)}\mathbf{F}_B^{(k)}$, namely
$\varepsilon_k = \frac{\|\mathbf{F}^* - \mathbf{F}_R^{(k)}\mathbf{F}_B^{(k)}\|_F}{\|\mathbf{F}^*\|_F}$,
is used as an error measure. In the proposed algorithm, alternate
iterations continue until $|\varepsilon_k - \varepsilon_{k-1}|$ falls
below a prescribed convergence tolerance $\epsilon$, and when this
occurs, the last iterate ($\mathbf{F}_R^{(k)}$, $\mathbf{F}_B^{(k)}$)
is taken to be a solution of problem (10).

2) Adaptive Thresholding for Phase Increments $\delta_{m,n}^{(k)}$:
The constraints on the magnitude of the phase increments in (19) limit
$\mathbf{F}_R^{(k+1)}$ to within a small neighborhood of
$\mathbf{F}_R^{(k)}$, which usually affects the algorithm's
convergence rate. This is however less problematic for (19) because
the effective range for each phase parameter in the RF precoder is
limited to $[0, 2\pi)$. In addition, the issue can be addressed by
making the upper bound (threshold) $\bar{\delta}^{(k)}$ in (19)
adaptive to the current error measure so as to improve the algorithm's
convergence rate. The adaptation of the threshold is performed as
follows:

1) set $\bar{\delta}^{(k+1)}$ slightly larger than
$\bar{\delta}^{(k)}$ if $\varepsilon_k$ is far greater than
$\epsilon$, and $\varepsilon_k < \varepsilon_{k-1}$ holds;

2) set $\bar{\delta}^{(k+1)}$ smaller than $\bar{\delta}^{(k)}$ if
$\varepsilon_k$ is close to $\epsilon$, or
$\varepsilon_k > \varepsilon_{k-1}$ holds.

Scenario 1) allows a larger phase increment while the algorithm
converges in the right direction ($\varepsilon_k$ is decreasing).
Scenario 2) reduces $\bar{\delta}^{(k+1)}$ to a smaller value either
when $\varepsilon_k$ increases, because the previous large phase
increment has made the approximation
$e^{j\delta_{m,n}^{(k)}} \approx 1 + j\delta_{m,n}^{(k)}$ invalid, or
when $\varepsilon_k$ is close to the required $\epsilon$, suggesting
that higher precision is required. In Section IV we shall come back to
this matter in terms of specific adjustments on $\bar{\delta}^{(k)}$.
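The two adaptation scenarios can be written as a small rule. The sketch below is only one possible reading: the function name `adapt_threshold`, the growth and shrink factors, the ratio used to decide that the error is "far greater" than the tolerance, and the cap on the threshold are all hypothetical tuning values that do not come from the text.

```python
def adapt_threshold(delta_bar, err_k, err_prev, tol,
                    grow=1.2, shrink=0.5, far_ratio=10.0, delta_max=0.1):
    """Return the phase-increment bound to use in the next iteration.

    grow, shrink, far_ratio and delta_max are hypothetical tuning values
    chosen for illustration only.
    """
    far_from_tol = err_k > far_ratio * tol
    if err_k < err_prev and far_from_tol:
        # Scenario 1: error is decreasing and still far above the tolerance.
        return min(delta_bar * grow, delta_max)
    if err_k > err_prev or not far_from_tol:
        # Scenario 2: error went up, or we are already close to the tolerance.
        return delta_bar * shrink
    return delta_bar  # error unchanged and far from tolerance: keep the bound


# Example calls with made-up error values.
grown = adapt_threshold(0.05, 0.5, 0.6, 1e-6)
shrunk = adapt_threshold(0.05, 0.7, 0.6, 1e-6)
```

Capping the threshold keeps the increments inside the range where the first-order approximation of the phase rotation remains reasonable.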

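3) Re-formulation of Problem (19): Another issue concerning problem
(19) is that its formulation in terms of the Hadamard product is not
suited for many convex-optimization solvers that require standard and
explicit formulations. Denoting the $p$-th row of $\mathbf{Q}^{(k)}$
by $\mathbf{q}_p^{(k)}$, we can write the objective function in (19)
as

$$\begin{aligned} & \sum_{p=1}^{N_t} \left\|\mathbf{q}_p^{(k)} - \frac{j}{\sqrt{N_t}}\,\boldsymbol{\delta}_p^{(k)}\,\mathrm{diag}\left\{e^{j\phi_{p,1}^{(k)}}, \ldots, e^{j\phi_{p,M_t}^{(k)}}\right\}\mathbf{F}_B^{(k)}\right\|_2^2 \\ ={} & \sum_{p=1}^{N_t} \left\|\mathbf{q}_p^{(k)} - \boldsymbol{\delta}_p^{(k)}\mathbf{A}_p^{(k)}\right\|_2^2, \end{aligned} \tag{20}$$

where
$\boldsymbol{\delta}_p^{(k)} = [\delta_{p,1}^{(k)}, \delta_{p,2}^{(k)}, \ldots, \delta_{p,M_t}^{(k)}]$
and
$\mathbf{A}_p^{(k)} = \frac{j}{\sqrt{N_t}}\,\mathrm{diag}\left\{e^{j\phi_{p,1}^{(k)}}, \ldots, e^{j\phi_{p,M_t}^{(k)}}\right\}\mathbf{F}_B^{(k)}$,
hence

$$\min_{\{\delta_{m,n}^{(k)}\}} \left\|\mathbf{Q}^{(k)} - \frac{j}{\sqrt{N_t}}\left(\left\{\delta_{m,n}^{(k)}\right\} \circ \left\{e^{j\phi_{m,n}^{(k)}}\right\}\right)\mathbf{F}_B^{(k)}\right\|_F^2 = \sum_{p=1}^{N_t} \min_{\boldsymbol{\delta}_p^{(k)}} \left\|\mathbf{q}_p^{(k)} - \boldsymbol{\delta}_p^{(k)}\mathbf{A}_p^{(k)}\right\|_2^2. \tag{21}$$

It follows that problem (19) can be solved by separately solving $N_t$
sub-problems

$$\begin{aligned} \min_{\boldsymbol{\delta}_p^{(k)}} \quad & \left\|\mathbf{q}_p^{(k)} - \boldsymbol{\delta}_p^{(k)}\mathbf{A}_p^{(k)}\right\|_2^2 \\ \text{s.t.} \quad & |\delta_{p,n}^{(k)}| \le \bar{\delta}^{(k)}, \quad \forall n, \end{aligned} \tag{22}$$

for $p = 1, 2, \cdots, N_t$. Note that each problem in (22) is an
explicitly formulated convex QP problem to which efficient
interior-point algorithms apply [25].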
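Each per-row sub-problem is a box-constrained least-squares problem in a real vector of phase increments. The sketch below solves one such row with projected gradient descent, which is a simple stand-in chosen for illustration and not the interior-point method referred to in the text. The function name `solve_row_qp`, the iteration count, and the random test data are assumptions made for this example.

```python
import numpy as np


def solve_row_qp(q, A, delta_bar, iters=500):
    """Minimise ||q - d @ A||^2 over real d with |d_n| <= delta_bar.

    q is a complex row (length Ns), A is complex (Mt x Ns), d is real (Mt).
    Projected gradient descent is used here purely for illustration.
    """
    # Stack real and imaginary parts: the objective becomes ||c - d @ B||^2.
    B = np.hstack([A.real, A.imag])        # shape (Mt, 2*Ns)
    c = np.concatenate([q.real, q.imag])   # shape (2*Ns,)
    # Step 1/L, with L = 2 * lambda_max(B B^T) the gradient Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(B @ B.T, 2) + 1e-12)
    d = np.zeros(A.shape[0])
    for _ in range(iters):
        grad = -2.0 * (c - d @ B) @ B.T
        d = np.clip(d - step * grad, -delta_bar, delta_bar)  # project onto box
    return d


rng = np.random.default_rng(0)
Mt, Ns = 6, 4   # illustrative sizes
A = rng.standard_normal((Mt, Ns)) + 1j * rng.standard_normal((Mt, Ns))
q = rng.standard_normal(Ns) + 1j * rng.standard_normal(Ns)
d = solve_row_qp(q, A, delta_bar=0.1)
```

Because the rows decouple, the same routine could be applied independently to every row, which also makes the RF update easy to parallelise.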

4) Choosing an Initial Point: Choosing an appropriate initial point to
start the proposed algorithm is of critical importance because the
original problem (10) is a non-convex problem which typically
possesses multiple local minimizers. As far as gradient-based
optimization algorithms are concerned, the likelihood of capturing the
global minimizer or a good local minimizer is known to be highly
dependent on how close the initial point is to the desired solution.

Note that the objective function in (10), namely
$\|\mathbf{F}^* - \mathbf{F}_R\mathbf{F}_B\|_F$, measures the
difference between the optimal unconstrained precoder $\mathbf{F}^*$
and an actual decomposition $\mathbf{F}_R\mathbf{F}_B$ in the feasible
region. If we temporarily neglect the constant amplitude constraint on
$\mathbf{F}_R$, the perfect decomposition of $\mathbf{F}^*$ can be
performed through the SVD
$\mathbf{F}^* = \mathbf{U}_F\boldsymbol{\Sigma}_F\mathbf{V}_F^H$. This
motivates us to construct an initial point based on the SVD of
$\mathbf{F}^*$. As $\mathbf{F}^*$ comes from the first $N_s$ right
singular vectors of the channel matrix $\mathbf{H}$, $\mathbf{F}^*$
has full column rank, which means all $N_s$ entries along the diagonal
of $\boldsymbol{\Sigma}_F$ are non-zero. Note that
$\mathbf{U}_F\boldsymbol{\Sigma}_F$ is an $N_t \times N_s$ matrix with
full column rank, $\mathbf{V}_F^H$ is an $N_s \times N_s$ matrix and
$\mathbf{F}_R$ consists of $M_t$ columns. To construct an initial
point that conforms to the dimensions of $\mathbf{F}_R$ and
$\mathbf{F}_B$, we generate an $N_t \times (M_t - N_s)$ matrix
$\tilde{\mathbf{F}}_R$ where the amplitude of each entry is equal to
$\frac{1}{\sqrt{N_t}}$ and the phase of each entry obeys a uniform
distribution over $[0, 2\pi)$. In this way, a decomposition of
$\mathbf{F}^*$ is found to be

$$\mathbf{F}^* = \begin{bmatrix} \mathbf{U}_F\boldsymbol{\Sigma}_F & \tilde{\mathbf{F}}_R \end{bmatrix} \begin{bmatrix} \mathbf{V}_F^H \\ \mathbf{0} \end{bmatrix}, \tag{23}$$

and ($\mathbf{F}_R = [\mathbf{U}_F\boldsymbol{\Sigma}_F \;\; \tilde{\mathbf{F}}_R]$,
$\mathbf{F}_B = [\mathbf{V}_F \;\; \mathbf{0}]^H$) is exactly a global
solution for $\min \|\mathbf{F}^* - \mathbf{F}_R\mathbf{F}_B\|_F$ when
no constraints are imposed. We stress that
$\mathbf{F}_R = [\mathbf{U}_F\boldsymbol{\Sigma}_F \;\; \tilde{\mathbf{F}}_R]$
is infeasible when the constant amplitude constraint on $\mathbf{F}_R$
is imposed.


Nevertheless, we can select a feasible initial point
$\mathbf{F}_R^{(0)}$ that is close to the above
$[\mathbf{U}_F\boldsymbol{\Sigma}_F \;\; \tilde{\mathbf{F}}_R]$ by
modifying the first partition $\mathbf{U}_F\boldsymbol{\Sigma}_F$ as
follows:

1) retaining the phases of all entries in
$\mathbf{U}_F\boldsymbol{\Sigma}_F$;

2) setting the amplitudes of all entries in
$\mathbf{U}_F\boldsymbol{\Sigma}_F$ to $\frac{1}{\sqrt{N_t}}$ to make
$\mathbf{F}_R^{(0)}$ feasible.

Since the modified $\mathbf{U}_F\boldsymbol{\Sigma}_F$ still
incorporates the information of the phases in decomposition (23), it
is intuitively clear that the $\mathbf{F}_R^{(0)}$ generated above is
reasonably near the global solution of problem (10), and for this
reason we choose it as the initial point for the proposed algorithm.
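The construction of the feasible starting point can be sketched in NumPy as follows. The function name `initial_rf_precoder`, the sizes, and the randomly generated semi-unitary stand-in for the unconstrained precoder are assumptions for illustration only.

```python
import numpy as np


def initial_rf_precoder(F_opt, Mt, rng):
    """Feasible F_R^(0): phases of U_F Sigma_F plus random-phase padding columns."""
    Nt, Ns = F_opt.shape
    U_F, s_F, _ = np.linalg.svd(F_opt, full_matrices=False)
    first = U_F * s_F                                    # U_F @ diag(s_F), Nt x Ns
    phases = np.angle(first)                             # step 1: keep the phases
    pad = rng.uniform(0.0, 2.0 * np.pi, (Nt, Mt - Ns))   # uniform random phases
    # Step 2: force every entry to the constant amplitude 1/sqrt(Nt).
    return np.exp(1j * np.hstack([phases, pad])) / np.sqrt(Nt)


rng = np.random.default_rng(2)
Nt, Mt, Ns = 32, 6, 4   # illustrative sizes
F_opt = np.linalg.qr(rng.standard_normal((Nt, Ns))
                     + 1j * rng.standard_normal((Nt, Ns)))[0]
F_R0 = initial_rf_precoder(F_opt, Mt, rng)
```

Only the amplitudes of the first partition are altered, so the phase structure of the unconstrained decomposition carries over into the starting point.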


Finally, the constraint $\|\mathbf{F}_R\mathbf{F}_B\|_F^2 = N_s$ in
the original matrix decomposition problem (10) is treated by
performing a normalization step where $\mathbf{F}_B$ is multiplied by
$\frac{\sqrt{N_s}}{\|\mathbf{F}_R\mathbf{F}_B\|_F}$. The normalization
assures that the transmission power remains consistent after
precoding. A step-by-step summary of the hybrid precoder design is
given below as Algorithm 1.


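Algorithm 1 The Hybrid Precoders Design via Matrix Decomposition
based on Alternating Optimization

Require: $\mathbf{F}^*$, $\mathbf{F}_R^{(0)}$
1: $\mathbf{F}_B^{(0)} = (\mathbf{F}_R^{(0)H}\mathbf{F}_R^{(0)})^{-1}\mathbf{F}_R^{(0)H}\mathbf{F}^*$
2: $\varepsilon_0 = \frac{\|\mathbf{F}^* - \mathbf{F}_R^{(0)}\mathbf{F}_B^{(0)}\|_F}{\|\mathbf{F}^*\|_F}$, $\varepsilon_{-1} = \infty$
3: $k = 0$
4: while $|\varepsilon_k - \varepsilon_{k-1}| > \epsilon$ do
5:   $k = k + 1$
6:   obtain $\mathbf{F}_R^{(k)}$ by solving (19)
7:   $\mathbf{F}_B^{(k)} = (\mathbf{F}_R^{(k)H}\mathbf{F}_R^{(k)})^{-1}\mathbf{F}_R^{(k)H}\mathbf{F}^*$
8:   $\varepsilon_k = \frac{\|\mathbf{F}^* - \mathbf{F}_R^{(k)}\mathbf{F}_B^{(k)}\|_F}{\|\mathbf{F}^*\|_F}$
9: end while
10: $\mathbf{F}_B = \frac{\sqrt{N_s}\,\mathbf{F}_B^{(k)}}{\|\mathbf{F}_R^{(k)}\mathbf{F}_B^{(k)}\|_F}$
11: return $\mathbf{F}_R = \mathbf{F}_R^{(k)}$, $\mathbf{F}_B$

B. Hybrid Combiners Design

The hybrid precoders are designed under the assumption that
$N_r$-dimensional minimum distance decoding can be performed at the
receiver. However, such a decoding scheme is difficult to implement in
practice due to its high complexity. In this paper, we employ linear
combining at the receiver. If the hybrid precoders were equivalent to
the unconstrained optimal precoder $\mathbf{F}^* = \mathbf{V}_1$, the
optimal unconstrained combiner $\mathbf{W}^*$ would be $\mathbf{U}_1$.
However, the error $\|\mathbf{F}^* - \mathbf{F}_R\mathbf{F}_B\|_F$ can
never be exactly zero, hence $\mathbf{U}_1$ deviates from the optimal
unconstrained combiner $\mathbf{W}^*$. The linear MMSE combiner
$\mathbf{W}_{MMSE}$ achieves the maximum spectral efficiency when only
linear combining is performed before detection and only 1-dimensional
detection is allowed for each data stream. The unconstrained linear
MMSE combiner is given in [?] as

$$\begin{aligned} \mathbf{W}^* = \mathbf{W}_{MMSE} &= \arg\min_{\mathbf{W}} E\left[\|\mathbf{s} - \mathbf{W}^H\mathbf{y}\|_2^2\right] \\ &= \frac{P}{N_s}\left(\frac{P}{N_s}\mathbf{H}\mathbf{F}_R\mathbf{F}_B\mathbf{F}_B^H\mathbf{F}_R^H\mathbf{H}^H + \sigma^2\mathbf{I}_{N_r}\right)^{-1}\mathbf{H}\mathbf{F}_R\mathbf{F}_B. \end{aligned} \tag{24}$$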

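Once $\mathbf{W}^*$ is obtained, the alternate optimization method
presented above can be directly applied to decompose $\mathbf{W}^*$
into hybrid combiners $\mathbf{W}_R$ and $\mathbf{W}_B$, which leads
to the problem

$$\begin{aligned} \min_{\mathbf{W}_R, \mathbf{W}_B} \quad & \|\mathbf{W}^* - \mathbf{W}_R\mathbf{W}_B\|_F \\ \text{s.t.} \quad & \mathbf{W}_R \in \mathcal{W}_R. \end{aligned} \tag{25}$$

We will evaluate the performance of the proposed hybrid processing
scheme through simulations in Section IV.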


C. Approach To Waterfilling Spectral Efficiency 

To reach a capacity-achieving processing scheme, the waterfilling
power allocation should be applied to the precoder.
In this case, the optimal unconstrained precoder and combiner
in Section III are updated to $\mathbf{F}^* = \mathbf{V}_1\boldsymbol{\Gamma}$ and $\mathbf{W}^* = \mathbf{U}_1$
respectively, where $\boldsymbol{\Gamma}$ is a diagonal matrix that performs the
waterfilling power allocation. The precoder so produced can
be decomposed directly through Algorithm 1. However, there
may be cases where no power is allocated to some data
streams corresponding to the lowest singular values of $\mathbf{H}$,
especially when the SNR is small. In other words, we may
end up with $\mathbf{F}^* = [\mathbf{F}', \mathbf{0}]$, where $\mathbf{F}'$ consists of the non-zero columns
of $\mathbf{F}^* = \mathbf{V}_1\boldsymbol{\Gamma}$ after waterfilling power allocation. In this
case, we can first apply the MD-HP scheme to $\mathbf{F}'$, which gives
$\mathbf{F}' = \mathbf{F}_R\mathbf{F}'_B$. The whole decomposition of $\mathbf{F}^*$ is then
given by $\mathbf{F}^* = [\mathbf{F}', \mathbf{0}] = [\mathbf{F}_R\mathbf{F}'_B, \mathbf{0}] = \mathbf{F}_R[\mathbf{F}'_B, \mathbf{0}] = \mathbf{F}_R\mathbf{F}_B$.
In this way, the zero-power allocation is realized through
the baseband precoding rather than the phase shifts in the RF
domain.
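The zero-power case can be illustrated with a short Python sketch of the textbook waterfilling rule $p_i = \max(\mu - 1/g_i, 0)$. The function names `waterfill` and `pad_baseband`, the example gains, and the interpretation of `gains` as SNR-scaled squared singular values are assumptions made for illustration; the exact scaling used to build the diagonal power-allocation matrix is not specified in this section.

```python
def waterfill(gains, total_power):
    """Textbook waterfilling: p_i = max(level - 1/g_i, 0) with sum(p_i) = total_power.

    gains: positive effective channel gains (assumed here to be the
    SNR-scaled squared singular values); powers are returned in input order.
    """
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    powers = [0.0] * len(gains)
    # Drop the weakest streams one at a time until the water level
    # lies above the inverse gain of every remaining stream.
    for n_active in range(len(gains), 0, -1):
        active = order[:n_active]
        level = (total_power + sum(1.0 / gains[i] for i in active)) / n_active
        if level > 1.0 / gains[active[-1]]:
            for i in active:
                powers[i] = level - 1.0 / gains[i]
            break
    return powers


def pad_baseband(fb_active, n_streams):
    """Append zero columns so that F_B = [F'_B, 0] has n_streams columns.

    Assumes the zero-power streams are the trailing ones, as in F* = [F', 0].
    """
    return [row + [0j] * (n_streams - len(row)) for row in fb_active]


# At low power the weakest stream receives nothing.
print(waterfill([4.0, 1.0, 0.1], 1.0))  # [0.875, 0.125, 0.0]
```

In this example only two of the three streams are active, so the decomposition would be run on the two non-zero columns and the resulting baseband matrix padded with one zero column.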
D. Quantized RF Phase Control

In practice, the phase of each entry in the RF precoder $\mathbf{F}_R$
or combiner $\mathbf{W}_R$ cannot be set to an arbitrary value because of
the limited precision of the implementation. To address this problem, we
also introduce a quantized phase implementation of $\mathbf{F}_R$ and
$\mathbf{W}_R$. Assume that the phase $\phi$ of each entry in $\mathbf{F}_R$ and $\mathbf{W}_R$ can
be quantized up to $L$ bits of precision by choosing the closest
neighbor based on the shortest Euclidean distance, which is
given by

$$\hat{\phi} = \frac{2\pi\hat{n}}{2^L}, \tag{26}$$

where $\hat{n} = \arg\min_{n\in\{0,\cdots,2^L-1\}} \left|\phi - \frac{2\pi n}{2^L}\right|$.
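A minimal Python sketch of this quantizer is given below. The function name `quantize_phase` is hypothetical. As an assumption of the sketch, the input phase is first wrapped into $[0, 2\pi)$ and a phase that rounds up to $2\pi$ is mapped to the grid point 0, which corresponds to the nearest point on the unit circle.

```python
import math


def quantize_phase(phi: float, bits: int) -> float:
    """Return the nearest phase on the uniform 2**bits-level grid over [0, 2*pi)."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    # Wrap phi into [0, 2*pi), round to the nearest grid index,
    # and let the index `levels` wrap around to 0.
    n_hat = round((phi % (2 * math.pi)) / step) % levels
    return n_hat * step


# With L = 2 bits the candidates are 0, pi/2, pi and 3*pi/2.
candidates = [quantize_phase(p, 2) for p in (0.1, 1.5, 3.5, 6.2)]
```

For the four inputs above the selected grid points are 0, $\pi/2$, $\pi$ and 0 respectively; the last one shows the wrap-around at $2\pi$.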


IV. Simulation Results 

In this section, we report the results of the simulations
conducted, where the convergence of the proposed matrix
decomposition method based on alternate optimization is
examined and the performance of the proposed MD-HP scheme
is evaluated.


A. Convergence Properties of Algorithm 1

Before we apply Algorithm 1 to design the hybrid precoders
and combiners, it is necessary to examine whether it will
converge to a level where the error $\epsilon_k$ is acceptably small.
This is because the original matrix decomposition problem to be
solved is non-convex, and there is no guarantee that Algorithm
1 will result in a satisfactory matrix decomposition.

We took a 256 × 64 MIMO system as an example and set
$N_s = 4$, $M_t = 6$. An i.i.d. Rayleigh fading channel matrix
with each entry obeying $\mathcal{CN}(0, 1)$ was randomly generated.

From Section III-A, the optimal unconstrained precoder $\mathbf{F}^*$
was obtained by selecting the first $N_s$ right singular vectors
based on the SVD of $\mathbf{H}$. An initial RF precoder
$\mathbf{F}_R^{(0)}$ was chosen by following the technique described
in Section III-A4. The threshold was set to $\epsilon$ = 10“® and the
first phase increment threshold was set to $\delta^{(1)} = 0.1$. In the
simulations, two options for $\delta^{(k)}$ were examined:

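1) $\delta^{(k)} = 0.1$, $\forall k$;

2) $\delta^{(k)}$ is enlarged relative to $\delta^{(k-1)}$ when $|\epsilon_{k-1} - \epsilon_{k-2}| > 100\cdot\epsilon$, and $\delta^{(k)} = 0.8\cdot\delta^{(k-1)}$ when $|\epsilon_{k-1} - \epsilon_{k-2}| < 100\cdot\epsilon$.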

For option 2) with the adaptive phase increment threshold, the
adjustment of $\delta^{(k)}$ depends on how close the previous two
error indicators are. When the difference between the previous two error
indicators is smaller than $100\cdot\epsilon$, which means that Algorithm 1
is about to converge, $\delta^{(k)}$ should be reduced to enhance the
precision of the solution by guaranteeing the effectiveness of
the approximation $e^{j\delta_{m,n}^{(k)}} \approx 1 + j\delta_{m,n}^{(k)}$. Otherwise, $\delta^{(k)}$ can
be augmented to accelerate the algorithm by enlarging the
feasible region of the convex sub-problem. Moreover, we need to decrease $\delta^{(k)}$
whenever $\epsilon_{k-1} > \epsilon_{k-2}$, which means that the previous $\delta^{(k-1)}$ was too
large to guarantee $e^{j\delta_{m,n}^{(k-1)}} \approx 1 + j\delta_{m,n}^{(k-1)}$. We restricted $\delta^{(k)} \in
[0.1, 0.5]$ by clamping $\delta^{(k)}$ to 0.1 (0.5) when it was smaller
(larger) than 0.1 (0.5), in case the feasible region of the sub-problem
became too small or too large.¹

¹All parameters given in this section can be revised for other specific cases.
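The adaptive rule of option 2) can be sketched in Python as follows. Only the shrink factor 0.8, the $100\cdot\epsilon$ test and the clamping interval $[0.1, 0.5]$ come from the text. The growth factor `grow=1.2` is a placeholder of our own choosing, and we assume that the same shrink factor is applied when the error has increased; the function name `next_delta` and the value of `eps` used in the example call are likewise hypothetical.

```python
def next_delta(delta_prev, err_prev, err_prev2, eps,
               grow=1.2, shrink=0.8, lo=0.1, hi=0.5):
    """One update of the phase increment threshold delta^(k).

    delta_prev : previous threshold delta^(k-1)
    err_prev   : error indicator eps_{k-1}
    err_prev2  : error indicator eps_{k-2}
    eps        : convergence threshold of the outer loop
    grow       : hypothetical enlargement factor (not given in the text)
    """
    error_increased = err_prev > err_prev2
    nearly_converged = abs(err_prev - err_prev2) < 100 * eps
    if error_increased or nearly_converged:
        # Keep the first-order approximation accurate by shrinking the step.
        delta = shrink * delta_prev
    else:
        # Far from convergence: enlarge the feasible region.
        delta = grow * delta_prev
    # Clamp to the interval [lo, hi].
    return min(max(delta, lo), hi)


# Example: the error dropped substantially, so the threshold grows.
delta_next = next_delta(0.2, err_prev=1.0, err_prev2=2.0, eps=1e-6)
```

Keeping the growth and shrink factors as parameters makes it easy to revise them for other scenarios, in line with the footnote above.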


To examine the effectiveness of the approximation
$e^{j(\phi_{m,n}^{(k)}+\delta_{m,n}^{(k)})} \approx (1 + j\delta_{m,n}^{(k)})e^{j\phi_{m,n}^{(k)}}$, we compare the traces of
$e^{j(\phi_{m,n}^{(k)}+\delta_{m,n}^{(k)})}$ and $(1 + j\delta_{m,n}^{(k)})e^{j\phi_{m,n}^{(k)}}$ within 100 iterations in
Fig. 2, where the red dashed line indicates the unit circle on the
complex plane. It is observed that the points of the two traces
($m = 1$, $n = 5$) update simultaneously and that corresponding
points remain very close, which suggests that the iteration
$e^{j\phi_{m,n}^{(k+1)}} = e^{j(\phi_{m,n}^{(k)}+\delta_{m,n}^{(k)})}$ may be regarded as a linear
operation over $\delta_{m,n}^{(k)}$. With adaptive $\delta^{(k)}$ updates, the iterate moves
with a relatively large step size at the beginning, when it
is far from the solution $e^{j\phi_{m,n}} \approx 0.5381 - j0.8429$, and then
gradually gets close to it. In Fig. 3 we show how the error
measure $\epsilon_k$ converges to about 0.2 as the number of iterations
increases when the adaptive and constant $\delta^{(k)}$ are applied
respectively. It can be observed that the adaptive threshold
helps the algorithm converge more quickly because it allows
the algorithm to conduct a search over a larger part of the
feasible region when the error $\epsilon_k$ is relatively small. The above
parameters will also be used in the next simulations.
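The closeness of the two traces can also be checked numerically. The sketch below (the function name `linearization_error` is hypothetical) evaluates the gap between the exact rotation and its first-order surrogate; the gap equals $|e^{j\delta} - 1 - j\delta|$, which does not depend on $\phi$ and is bounded by $\delta^2/2$.

```python
import cmath


def linearization_error(phi: float, delta: float) -> float:
    """|exp(j(phi+delta)) - (1 + j*delta) exp(j*phi)| for one RF entry."""
    exact = cmath.exp(1j * (phi + delta))
    approx = (1 + 1j * delta) * cmath.exp(1j * phi)
    return abs(exact - approx)


# Errors at the two ends of the clamping interval [0.1, 0.5].
err_small = linearization_error(1.0, 0.1)
err_large = linearization_error(1.0, 0.5)
```

At $\delta = 0.1$ the gap is about $5 \times 10^{-3}$, while at $\delta = 0.5$ it grows to roughly 0.12, which is consistent with shrinking the threshold as the algorithm approaches convergence.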


Fig. 2. The traces of $e^{j(\phi_{m,n}^{(k)}+\delta_{m,n}^{(k)})}$ and $(1 + j\delta_{m,n}^{(k)})e^{j\phi_{m,n}^{(k)}}$ on the complex plane.



Fig. 3. The convergence performance of Algorithm 1 when applying the
adaptive and constant $\delta^{(k)}$ respectively.


















B. Spectral Efficiency Evaluation 

In this part of the simulation section, we illustrate the spectral
efficiency performance of the proposed MD-HP scheme by
comparing it with several other options under large i.i.d.
Rayleigh channel and mmWave channel settings respectively.
The SNR $\gamma$ ranged from -40 dB to 0 dB in
all simulations.

1) Large i.i.d. Rayleigh Fading Channels: The MD-HP
scheme is compared in Fig. 4 against the optimal unconstrained
SVD based processing scheme when $N_s = 8$ data
streams are transmitted in a 256 × 64 massive MIMO system.
For the MD-HP scheme, the cases of 8 and 12
RF chains (along with their quantized versions) are examined.
When 12 RF chains are implemented at both the transmitter
and receiver, the performance of the MD-HP scheme is
near-optimal compared with the optimal unconstrained SVD based
scheme. Even when the number of RF chains is reduced
to the number of data streams, namely, 8 RF chains
are employed, the spectral efficiency achieved by the MD-HP
scheme decreases only slightly, by around 3 bps/Hz. As for
the heavily quantized versions ($L = 2$ bits with the phase
candidates $\{0, \frac{1}{2}\pi, \pi, \frac{3}{2}\pi\}$) corresponding to the 8 and 12 RF
chain settings, the spectral efficiency suffers less than 2.5
dB loss in terms of SNR. Fig. 5 further demonstrates
the spectral efficiency performance by also setting the number
of transmit data streams to 4 while 8 RF chains are used.
Compared with the case of 4 transmit data streams, the
performance of the 8 data stream case is evidently improved
thanks to the multiplexing gain. Notably, there is a small
gap between the MD-HP scheme and the SVD based scheme,
which can be eliminated by properly increasing the number of
RF chains, e.g., to double the number of data streams in the
case of $N_s = 4$. In addition, the quantized versions ($L = 2$)
also result in a 2.5 dB loss in performance. Under the critical
condition that the numbers of RF chains at the transmitter and
receiver are set to $M_t = M_r = N_s$, Fig. 6 shows the spectral
efficiency of the above schemes with $N_s = 2$, 4 and 8. It is
observed that the MD-HP scheme (including the quantized
version) consistently remains close to the optimal spectral
efficiency as $N_s$ increases, which implies that the MD-HP
scheme can probably achieve near-optimal performance
even when a large number of data streams are transmitted.

Fig. 4. Spectral efficiency achieved by different processing schemes of a
256 × 64 massive MIMO system in i.i.d. Rayleigh fading channels where
$N_s = 8$ data streams are transmitted through 8 and 12 RF chains respectively.

Fig. 5. Spectral efficiency achieved by different processing schemes of a
256 × 64 massive MIMO system in i.i.d. Rayleigh fading channels where
$N_s = 4$ and 8 data streams are transmitted through 8 RF chains respectively.

2) Large mmWave Channels: Our proposed MD-HP
scheme can also be applied to large mmWave channels,
for which a number of hybrid processing schemes have
been studied in the literature. In the simulations, the clustered
mmWave channel model © was adopted to characterize the
limited scattering feature of such channels. Apart from the unconstrained SVD
based processing and our MD-HP scheme, we employ the
spatially sparse processing IfMl, which designs the hybrid
precoders/combiners by capturing the characteristics of the
dominant paths. The propagation model mainly follows the
settings in HI: 1) the mmWave channel incorporates $N_c = 8$
clusters, each of which has $N_p = 10$ paths; 2) the transmitter
angle sector is assumed to be 60°-wide in the azimuth, while
the receiver is equipped with a smaller omni-directional antenna array; 3)
the angle spreads of the transmitter and receiver
are all set to 7.5°; 4) the antenna spacing $d$ is equal to
half a wavelength. In Fig. 7 the spectral efficiency performance
is demonstrated for a 256 × 64 mmWave MIMO system, where
$N_s = 8$ data streams are transmitted through 8 or 12 RF
chains. Our proposed MD-HP scheme clearly outperforms
the spatially sparse processing scheme when the same number
of RF chains is implemented. Moreover, the MD-HP scheme
can even achieve higher spectral efficiency with only 8 RF
chains than the spatially sparse processing scheme with 12 RF
chains. In particular, the SVD based processing is closely
approached by the MD-HP scheme given 12 RF chains. These results
show that our proposed MD-HP scheme can better capture
the characteristics of the mmWave channel than the spatially
sparse processing scheme.

Fig. 6. Spectral efficiency achieved by different processing schemes of a
256 × 64 massive MIMO system in i.i.d. Rayleigh fading channels where
$N_s = 2$, 4 and 8 data streams are transmitted respectively and the numbers
of RF chains are set to $M_t = M_r = N_s$.

Fig. 7. Spectral efficiency achieved by different processing schemes of a
256 × 64 massive MIMO system in mmWave channels where $N_s = 8$ data
streams are transmitted through 8 and 12 RF chains respectively.

V. Conclusion

In this paper, we have designed the hybrid RF and baseband
precoders/combiners for multi-stream transmission in P2P
massive MIMO systems by solving a non-convex matrix
decomposition problem. Based on an alternate optimization
technique, we have transformed the non-convex matrix
decomposition problem into a series of convex sub-problems.
Careful handling of the phase increment of each entry in
the RF precoders and combiners in each iteration and a smart
choice of the initial point have allowed our algorithm to
yield a near-optimal solution with high probability. The MD-HP
scheme can be applied to any general massive MIMO
channels such as i.i.d. Rayleigh fading channels and mmWave
channels. By providing a sufficient number of RF chains (e.g.,
double the number of the transmit data streams), the
pre-designed unconstrained digital precoder/combiner of a large
dimension can be closely approached and thus
near-optimal performance is achieved. A low quantization level
such as 2 bits for phase shifters has been shown to lead to
around 2.5 dB loss in performance. We aim to incorporate
channel estimation and reduce the time complexity of the MD-HP
scheme in the future.